Abstract - Probability transformations provide a method of
relating Dempster-Shafer sources of evidence to subjective probability
assignments. These transforms are constructed to facilitate
decision making over a set of mutually exclusive hypotheses. 
The probability information content (PIC) metric has been 
recently proposed for characterizing the performance of different 
probability transforms. To investigate the applicability of the PIC 
metric, we compare five probability transformations (i.e., BetP,
PrPl, PrNPl, PrHyb, and DSmP) using a simulator of human
responses from cognitive psychology known as two-stage dynamic 
signal detection. Responses were simulated over two tasks: a line 
length discrimination task and a city population size discrimina- 
tion task. Human decision-makers were modeled for these two 
tasks by Pleskac and Busemeyer (2010). Subject decisions and 
confidence assessments were simulated and combined for both 
tasks using Yager’s rule and mapped into subjective probabil- 
ities using the five probability transforms. Receiver operating 
characteristic (ROC) curves, normalized areas under the ROC 
curves (AUCs), along with average PIC values were obtained 
for each probability transform. Our results indicate that higher 
PIC values do not necessarily equate to higher discriminability 
(i.e., higher normalized AUCs) between probability transforms. 
In fact, all five probability transforms exhibited nearly the same 
normalized AUC values. At lower, fixed false alarm rates, the
BetP, PrPl, PrNPl, and PrHyb transforms yielded higher
detection rates than the DSmP transform. For higher, fixed false
alarm rates, the DSmP transform yielded higher detection rates
than the other four transforms. These trends were observed over
both tasks, which suggests that the PIC may not be sufficient for 
evaluating the performance of probability transforms. 

Index Terms - Data fusion, Dempster-Shafer theory, Belief
fusion, Human Simulation, Probability transformations

I. Introduction 

The Dempster-Shafer theory of beliefs is a popular tool 
in the information fusion community (e.g., [1]-[6]). As opposed
to the use of subjective probabilities (i.e., Bayesian
epistemology [7]), the Dempster-Shafer approach employs a 
normalized measure of evidence (i.e., belief mass assignment) 
on a powerset of alternatives. The result is a method of spec- 
ifying imprecise evidence that results in classes of subjective 
probabilities (i.e., belief and plausibility intervals [8], [9]). 
To facilitate decision making, probability transformations are 
used to generate a subjective probability supported by a given 
belief mass assignment. There exist several Dempster-Shafer
theory based fusion rules [10], as well as a large number of
probability transformations (e.g., [11]-[13]). These transformations
are usually evaluated through the use of hypothetical
examples and by measuring the amount of entropy present
in the resulting transformed probabilities for a given set of
evidence (i.e., the probability information content, PIC, as in
[13], [14]).

In this study, we simulate the error rates of a fusion 
system for a selection of probability transformations using 
models of human responses (i.e., decision-making, confidence 
assessment, and response time) from cognitive psychology. 
The human response model employed is the two-stage dynamic 
signal detection (2DSD) from [15]. We have used this model 
previously to simulate the performance of fusion combination 
rules over binary decision tasks in [16], [17]. In the current 
study, we use the line length discrimination task and the city 
population size discrimination task that have been previously 
modeled in [15]. For both tasks, human decision-makers were
positioned at a computer monitor. For the line length discrim- 
ination task, subjects were presented with a pair of lines and 
asked to determine which of the two was longer. For the city 
population size discrimination task, subjects were presented 
with two United States cities and asked to determine which 
of the two had a higher population. For both tasks, subjects 
were asked to provide their confidences in their declarations 
on a subjective probability scale. 

The remainder of the paper is organized as follows. Sec- 
tion II overviews 2DSD, the line length discrimination task, 
and the city population size discrimination task. Section III 
describes the probability transformations investigated here, 
and the relevant Dempster-Shafer terminology. Human re- 
sponses are combined using Yager’s rule [18], after which 
the combined results are transformed into subjective probabilities
using the Pignistic (i.e., BetP) [19], PrPl, PrNPl,
PrHyb [11], and DSmP [13] probability transformations.
The framework of our simulation is described in Section IV. 
In Section V, receiver operating characteristic (ROC) curves 
and normalized areas under the ROC curves are estimated after 
applying the five probability transformations investigated here, 
along with the corresponding probability information content 
(PIC) values. 

II. Human Response Simulation 

The human response simulation methodology employed 
here involves the line length discrimination task and the city 
population size discrimination task given in [15]. We provide
a brief overview of the 2DSD model of human responses and
the two tasks (see [15] for more information on the parameter
estimation and validation for the human subjects).

A. Two-Stage Dynamic Signal Detection [15] 

Let $\mathcal{A} = \{A, \bar{A}\}$ represent two alternatives on a binary
decision task. The 2DSD human response model simulates
internal evidence accumulation for one alternative over the
other, $L(t)$, using the stochastic linear difference equation,

$$\Delta L(t) = \delta \Delta t + \sqrt{\Delta t}\,\epsilon(t + \Delta t), \quad L(0) = L_0, \qquad (1)$$

where $\delta$ is the drift rate and $\epsilon(t)$ is a simulated white noise
process with zero mean and variance $\sigma^2$. The parameter $\sigma$ is
known as the drift coefficient. The drift rate $\delta$ is positive if
$A$ is true and negative if $\bar{A}$ is true. This type of stochastic
process, known as drift diffusion, is a common model of
human decision making and response time used in cognitive
psychology. To make a choice, $L(t)$ is accumulated using
$\Delta L(t)$ until a threshold, either $\theta_A$ or $-\theta_{\bar{A}}$, is crossed (where
$L_0 \in [-\theta_{\bar{A}}, \theta_A]$). The decision $a$ is then given as

$$a = \begin{cases} A & L(t) > \theta_A, \\ \bar{A} & L(t) < -\theta_{\bar{A}}, \\ \text{wait} & \text{otherwise.} \end{cases} \qquad (2)$$

Confidence assessment is achieved by waiting an additional
interjudgment time, $\tau$, and binning the final value of $L(t)$.
Let $P^{(a)} = [p_1^{(a)} \cdots p_{K_a}^{(a)}]$ denote the $K_a$ possible confidence
values when choosing $a \in \mathcal{A}$ at time $t_d$. The chosen confidence
level $p \in P^{(a)}$ for deciding $a$ after waiting $t_c = t_d + \tau$ is given
as

$$p = p_i^{(a)} \text{ when } L(t_c) \in [c_{i-1}^{(a)}, c_i^{(a)}], \qquad (3)$$

where $c_0^{(a)} = -\infty$ and $c_{K_a}^{(a)} = \infty$ for each $a \in \mathcal{A}$. The
remaining confidence bin parameters $C^{(a)} = [c_1^{(a)} \cdots c_{K_a - 1}^{(a)}]$
are chosen such that $c_{i-1}^{(a)} < c_i^{(a)}$ for each $i \in \{1, \ldots, K_a - 1\}$.

The drift rate $\delta$ and initial condition $L_0$ can be chosen
randomly at the beginning of a given simulation to allow for
decision variability between trials. This randomization of $\delta$
and $L_0$ is performed in [15] by choosing $\delta$ from a normal
distribution with mean $\nu$ and variance $\eta^2$, and choosing $L_0$
from a uniform distribution in the range $[-0.5 s_z, 0.5 s_z]$. The
values $\nu$ and $\eta$ are the drift rate mean and standard deviation,
and $s_z$ is the size of the interval that $L_0$ is chosen from.
To simplify the implementation and parameter estimation of
2DSD, the authors of [15] suggest the following:

• Set $\theta_A = \theta_{\bar{A}} = \theta$.

• Standardize possible confidence assessment values (e.g.,
$P^{(A)} = P^{(\bar{A})} = [0.50, 0.60, \ldots, 1.00]$).

• Fix the confidence interval bins between each alternative
(i.e., $C^{(A)} = C^{(\bar{A})} = C = \{c_1, c_2, \ldots, c_5\}$).

• Fix $\sigma = 0.1$.

The 2DSD parameter set for a single subject, $\mathcal{S}$, becomes

$$\mathcal{S} = \{\nu, \eta, s_z, \theta, \tau, c_1, c_2, c_3, c_4, c_5\}. \qquad (4)$$

The ten parameters defined by $\mathcal{S}$ can be determined from
a subject’s decision, confidence, and response time statistics
using quantile maximum probability estimation [20].
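
As a rough illustration of how equations (1) to (3) fit together under the simplifications listed above, the following Python sketch simulates a single 2DSD trial. It is not the implementation used in [15] or in this study: the function name, the step size `dt`, the mirrored use of the confidence bins for the second alternative, and the parameter values in the usage lines are all assumptions made for the example.

```python
import math
import random

CONF_LEVELS = (0.50, 0.60, 0.70, 0.80, 0.90, 1.00)

def simulate_2dsd_trial(nu, eta, s_z, theta, tau, bins,
                        sigma=0.1, dt=0.001, rng=random):
    """Simulate one trial; returns (decision, confidence, decision_time)."""
    delta = rng.gauss(nu, eta)                # drift rate for this trial
    L = rng.uniform(-0.5 * s_z, 0.5 * s_z)    # initial condition L_0
    t_d = 0.0
    # Stage 1: accumulate evidence until +theta or -theta is crossed,
    # following equations (1) and (2).
    while -theta < L < theta:
        L += delta * dt + math.sqrt(dt) * rng.gauss(0.0, sigma)
        t_d += dt
    decision = "A" if L >= theta else "not_A"
    # Stage 2: keep accumulating for the interjudgment time tau.
    for _ in range(round(tau / dt)):
        L += delta * dt + math.sqrt(dt) * rng.gauss(0.0, sigma)
    # Assumption: the shared bins are mirrored for "not_A", so they are
    # applied to the evidence in favour of whichever alternative was chosen.
    evidence = L if decision == "A" else -L
    index = sum(1 for c in bins if evidence > c)   # locate the bin, eq. (3)
    return decision, CONF_LEVELS[index], t_d

# Hypothetical parameter values, chosen only so that the example runs quickly.
rng = random.Random(1)
trial = simulate_2dsd_trial(nu=0.2, eta=0.05, s_z=0.02, theta=0.1, tau=0.5,
                            bins=[0.0, 0.05, 0.10, 0.15, 0.20], rng=rng)
print(trial)
```

Passing a seeded `random.Random` instance keeps the sketch reproducible; the five bin edges correspond to the six standardized confidence values.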



TABLE I
Relationship between the mean drift rates $\nu$ of [15] and the line
length and city population rank differences.

Mean drift rate    Line length difference    Population rank difference
$\nu_1$            0.27 mm                   1 - 9
$\nu_2$            0.59 mm                   10 - 18
$\nu_3$            1.23 mm                   19 - 29
$\nu_4$            1.87 mm                   30 - 43
$\nu_5$            2.51 mm                   44 - 59
$\nu_6$            3.15 mm                   60 - 99



B. Overview of Tasks 

We use the line length discrimination task and the city 
population size discrimination task, modeled in [15], as case 
studies. For the line length discrimination task of [15], six 
individuals were asked to compare a pair of horizontal lines 
with different lengths. The two lines were separated by a 20 
millimeter long line. Each pair of lines consisted of a 32.00 
millimeter long line and either a 32.27, 32.59, 33.23, 33.87, 
34.51, or 35.15 millimeter long line. For the city population 
size discrimination task, six individuals were asked to compare 
pairs of the 100 most populated United States cities. Their 
answers were graded based on city population rank estimates 
taken from the 2006 U.S. census [21]. 

For both tasks, subjects were instructed to first make a 
declaration as to which of the two stimuli is larger (i.e., the
longer line or the more populated city). Immediately thereafter 
the subjects were asked to assess their own confidence in 
that declaration on the probability scale {0.50, 0.60, ..., 1.00}. 
Subject mean drift rates v varied based on task difficulty, as 
shown in Table I. The difficulty of the line length discrimi- 
nation task decreases as the actual length difference between 
the two lines increases. The difficulty of the city population 
size discrimination task decreases as the difference between 
the population ranking of the two cities increases. Separate 
decision thresholds $\theta$ were determined for two cases of each
task: (1) when subjects were asked to focus on
fast responses, and (2) when subjects were asked to focus on
accurate responses. Here we have used 2DSD parameter sets
for subjects focusing on accurate responses. The values are
available in [15, Tables 3 and 6].

III. Probability Transforms and Decision Making 
A. Preliminaries 

The following necessary background on Dempster-Shafer
theory is taken from [8] and [10]. Consider the set of mutually
exclusive alternatives $\Omega$ and its powerset $2^\Omega$. A Dempster-Shafer
approach to information fusion assesses evidence on
the powerset of alternatives through the use of belief mass
assignments (BMAs), $m(X)$, defined for all $X \subseteq \Omega$. BMAs are
normalized quantities of evidence, such that
$\sum_{X \subseteq \Omega} m(X) = 1$. Two additional functions known as
Belief, $\mathrm{Bel}(X)$, and Plausibility, $\mathrm{Pl}(X)$, are defined.
They respectively represent
a minimum and maximum amount of evidence assigned by
the BMA $m(X)$ on some $X \subseteq \Omega$. That is,

$$\mathrm{Bel}(X) = \sum_{Z \subseteq \Omega,\; Z \subseteq X} m(Z), \qquad (5)$$

and

$$\mathrm{Pl}(X) = 1 - \mathrm{Bel}(\bar{X}) = \sum_{Z \subseteq \Omega,\; Z \cap X \neq \emptyset} m(Z), \qquad (6)$$

where $\emptyset$ is the empty set. When $m(\emptyset) = 0$, BMAs become
normalized measures of evidence on the powerset and the
belief and plausibility measures can be interpreted as bounds
on potential subjective probabilities which are supported by
a given BMA. For the remainder of the study, we will
assume that $m(\emptyset) = 0$. The difference between the belief
and plausibility bounds represents the amount of imprecise
evidence given by the BMA (i.e., the mass assigned to the
non-singleton elements of the powerset). If no belief masses
are assigned to the non-singleton elements of the powerset,
the BMA is equivalent to a subjective probability assignment.
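
To make equations (5) and (6) concrete, the sketch below evaluates belief and plausibility on a binary frame. The representation of a BMA as a dictionary keyed by `frozenset` objects and the function names are assumptions of this example; the masses are the first row of Table II.

```python
def belief(m, X):
    """Bel(X): total mass of the subsets of X, equation (5)."""
    return sum(mass for Z, mass in m.items() if Z <= X)

def plausibility(m, X):
    """Pl(X): total mass of the sets that intersect X, equation (6)."""
    return sum(mass for Z, mass in m.items() if Z & X)

H0, H1 = frozenset({"H0"}), frozenset({"H1"})
OMEGA = H0 | H1

# Average BMA for the 0.27 mm line length difference (Table II).
m = {H0: 0.22, H1: 0.47, OMEGA: 0.31}

# Bel(H1) is the mass on H1 alone; Pl(H1) also counts the mass on OMEGA.
print(belief(m, H1), plausibility(m, H1))
```

The gap between the two printed values is the mass on $H_0 \cup H_1$, which is the imprecise evidence described above.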

There exist several fusion combination rules in the literature
that operate using BMAs, each with its own benefits and
drawbacks [10]. In the current study, we make use of Yager’s
rule for combining belief mass assignments [18]. Given two
BMAs, $m_1$ and $m_2$, over the same powerset of alternatives,
Yager’s rule is given as

$$m_{1,2}(X) = \begin{cases} \sum_{Z_1, Z_2 \subseteq \Omega,\; Z_1 \cap Z_2 = X} m_1(Z_1) m_2(Z_2) & X \neq \emptyset,\ X \neq \Omega, \\ \sum_{Z_1, Z_2 \subseteq \Omega,\; Z_1 \cap Z_2 = X} m_1(Z_1) m_2(Z_2) + \mathcal{K} & X = \Omega, \end{cases} \qquad (7)$$

where $\mathcal{K}$ is defined as the degree of conflict between $m_1$ and
$m_2$ such that

$$\mathcal{K} = \sum_{Z_1, Z_2 \subseteq \Omega,\; Z_1 \cap Z_2 = \emptyset} m_1(Z_1) m_2(Z_2). \qquad (8)$$

Yager’s rule is commutative, but in general not associative
[10]. To make a decision, the BMAs produced by Yager’s
rule must be mapped into a subjective probability assignment
using a probability transformation.
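
A minimal sketch of equations (7) and (8) is given below, assuming BMAs are stored as dictionaries keyed by `frozenset` objects. The function name and the example masses are hypothetical and are not taken from the simulations in this paper.

```python
from itertools import product

def yager_combine(m1, m2, frame):
    """Combine two BMAs with Yager's rule; conflict is moved to the frame."""
    combined = {}
    conflict = 0.0
    for (Z1, w1), (Z2, w2) in product(m1.items(), m2.items()):
        X = Z1 & Z2
        if X:
            combined[X] = combined.get(X, 0.0) + w1 * w2
        else:
            conflict += w1 * w2          # degree of conflict K, equation (8)
    combined[frame] = combined.get(frame, 0.0) + conflict
    return combined

H0, H1 = frozenset({"H0"}), frozenset({"H1"})
OMEGA = H0 | H1

m1 = {H1: 0.8, OMEGA: 0.2}   # hypothetical source declaring H1
m2 = {H0: 0.6, OMEGA: 0.4}   # hypothetical source declaring H0
m12 = yager_combine(m1, m2, OMEGA)
# m12 is approximately {H1: 0.32, H0: 0.12, OMEGA: 0.56}: the conflict
# K = 0.8 * 0.6 = 0.48 has been added to the mass on OMEGA.
print(m12)
```

Because the conflict is reassigned to $\Omega$ after every pairwise combination, combining three or more sources in a different order can give a different result, which is the lack of associativity noted above.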

B. Pignistic Probability Transform 

The pignistic probability transformation (BetP) was first
proposed by Philippe Smets in [22] and then included in [23]
as a part of the Transferable Belief Model. The pignistic
probability transform involves transferring the belief mass
from each non-singleton element of a BMA to its respective
singleton elements by dividing its mass equally (i.e., according
to its cardinality). The pignistic probability can be defined for
any $X \subseteq \Omega$ as

$$\mathrm{BetP}(X) = \sum_{Z \subseteq \Omega,\; Z \neq \emptyset} \frac{|X \cap Z|}{|Z|}\, m(Z), \qquad (9)$$

where $|\cdot|$ is the cardinality of a set. The pignistic probability
transform satisfies all three Kolmogorov Axioms, and hence it
only needs to be computed for the singleton elements $\omega \in \Omega$.
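
The equal split of equation (9) can be sketched for the singleton elements as follows. The dictionary representation of the BMA and the function name are assumptions of this example; the masses are the first row of Table II.

```python
def betp(m, frame):
    """Pignistic probability of each singleton in the frame, equation (9)."""
    probs = {w: 0.0 for w in frame}
    for Z, mass in m.items():
        for w in Z:
            probs[w] += mass / len(Z)   # each member receives an equal share
    return probs

H0, H1 = frozenset({"H0"}), frozenset({"H1"})
OMEGA = H0 | H1
m = {H0: 0.22, H1: 0.47, OMEGA: 0.31}   # first row of Table II

# Half of the 0.31 on OMEGA goes to each hypothesis, giving approximately
# {'H0': 0.375, 'H1': 0.625}.
print(betp(m, ("H0", "H1")))
```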



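C. Sudano Probability Transforms

Sudano has proposed a suite of five probability transformations
in [11]: PrPl, PrNPl, PraPl, PrBel, and PrHyb. In
the current study, we will only focus on the PrNPl, PrPl,
and PrHyb transforms¹. These three probability transforms
are defined for the singleton elements $\omega \in \Omega$ as

$$\mathrm{PrPl}(\omega) = \mathrm{Pl}(\omega) \sum_{Z \subseteq \Omega,\; \omega \in Z} \frac{m(Z)}{\sum_{\omega' \in Z} \mathrm{Pl}(\omega')}, \qquad (10)$$

$$\mathrm{PrNPl}(\omega) = \frac{\mathrm{Pl}(\omega)}{\sum_{\omega' \in \Omega} \mathrm{Pl}(\omega')}, \qquad (11)$$

and

$$\mathrm{PrHyb}(\omega) = \mathrm{PraPl}(\omega) \sum_{Z \subseteq \Omega,\; \omega \in Z} \frac{m(Z)}{\sum_{\omega' \in Z} \mathrm{PraPl}(\omega')}, \qquad (12)$$

where $\mathrm{PraPl}(\omega) = \mathrm{PrPl}(\omega)$ for binary $\Omega$. We direct the
reader to [11] or [13] for the full definition of $\mathrm{PraPl}(\omega)$.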
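
The three transforms can be sketched for singleton elements as shown below. This is an illustration under stated assumptions rather than a reference implementation: the helper names are hypothetical, BMAs are dictionaries keyed by `frozenset` objects, and `prhyb_binary` is restricted to binary frames so that PrPl can stand in for PraPl, as noted above.

```python
def singleton_plausibility(m, frame):
    """Pl of each singleton: total mass of the sets that contain it."""
    return {w: sum(mass for Z, mass in m.items() if w in Z) for w in frame}

def proportional_split(m, frame, weights):
    """Share each focal element's mass among its members by weight."""
    return {
        w: weights[w] * sum(mass / sum(weights[v] for v in Z)
                            for Z, mass in m.items() if w in Z and mass > 0)
        for w in frame
    }

def prpl(m, frame):
    # Masses are shared in proportion to singleton plausibilities.
    return proportional_split(m, frame, singleton_plausibility(m, frame))

def prnpl(m, frame):
    # Singleton plausibilities are simply normalized to sum to one.
    pl = singleton_plausibility(m, frame)
    total = sum(pl.values())
    return {w: pl[w] / total for w in frame}

def prhyb_binary(m, frame):
    # Assumption: binary frame, so PraPl coincides with PrPl.
    return proportional_split(m, frame, prpl(m, frame))

H0, H1 = frozenset({"H0"}), frozenset({"H1"})
OMEGA = H0 | H1
m = {H0: 0.22, H1: 0.47, OMEGA: 0.31}   # first row of Table II
frame = ("H0", "H1")

for transform in (prnpl, prpl, prhyb_binary):
    print(transform.__name__, transform(m, frame))
```

For this BMA the probability assigned to $H_1$ increases from PrNPl to PrPl to PrHyb.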



D. Dezert-Smarandache’s Probability Transform

A recently proposed probability transformation by Dezert
and Smarandache, denoted DSmP, is introduced in [13].
DSmP distributes belief masses assigned to the non-singleton
elements of $\Omega$ proportionally, according to the belief masses
assigned to the singleton elements. The transformation is
defined in [13] using Dedekind lattices (i.e., the hyperpowerset
of a set of alternatives). It is defined in terms of the powerset
as

$$\mathrm{DSmP}_\epsilon(X) = \sum_{Z \subseteq \Omega} \frac{\sum_{\omega \in X \cap Z} m(\omega) + \epsilon |X \cap Z|}{\sum_{\omega \in Z} m(\omega) + \epsilon |Z|}\, m(Z), \qquad (13)$$

where $\epsilon \in [0, \infty]$ is a tuning parameter. As $\epsilon \to 0$, DSmP
approaches Sudano’s PrBel transform. As $\epsilon \to \infty$, DSmP
approaches BetP [13]. The authors of [13] suggest selecting
a small value for $\epsilon$ in order to minimize the amount of entropy
present in the probabilities resulting from transformation. With
this in mind, we used here $\epsilon = 0.001$. Similar to BetP,
DSmP satisfies all three Kolmogorov Axioms and only needs
to be computed for the singleton elements $\omega \in \Omega$.
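
Equation (13) can be sketched for singleton elements as follows. The function name and the dictionary representation of the BMA are assumptions of this example; the default `eps=0.001` matches the value used here, and the masses are the first row of Table II.

```python
def dsmp(m, frame, eps=0.001):
    """DSmP of each singleton in the frame, equation (13)."""
    probs = {}
    for w in frame:
        own_mass = m.get(frozenset([w]), 0.0)
        total = 0.0
        for Z, mass in m.items():
            if w in Z:
                # Denominator: singleton masses inside Z plus eps * |Z|.
                denom = sum(m.get(frozenset([v]), 0.0) for v in Z) + eps * len(Z)
                total += (own_mass + eps) / denom * mass
        probs[w] = total
    return probs

H0, H1 = frozenset({"H0"}), frozenset({"H1"})
OMEGA = H0 | H1
m = {H0: 0.22, H1: 0.47, OMEGA: 0.31}   # first row of Table II

print(dsmp(m, ("H0", "H1")))            # small eps: split follows singleton masses
print(dsmp(m, ("H0", "H1"), eps=1e9))   # very large eps: close to BetP
```

With `eps` exactly zero the denominator can vanish when a focal element contains no singleton with positive mass, which is one practical reason to keep a small positive value.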



¹ For binary decision tasks, PrPl and PraPl are equivalent [13]. Furthermore,
the later defined DSmP transform provides a more mathematically
robust definition than PrBel [13].

TABLE II
Average BMA after combination ($H_1$ true, line task).

Line length difference    $m(H_0)$    $m(H_1)$    $m(H_0 \cup H_1)$
0.27 mm                   0.22        0.47        0.31
0.59 mm                   0.08        0.69        0.23
1.23 mm                   0.01        0.87        0.11
1.87 mm                   0.00        0.94        0.05

TABLE III
Average PIC for each probability transformation ($H_1$ true, line task).

Line length difference    BetP    PrPl    PrNPl    PrHyb    DSmP
0.27 mm                   0.56    0.63    0.51     0.68     0.83
0.59 mm                   0.67    0.72    0.64     0.76     0.88
1.23 mm                   0.85    0.87    0.84     0.89     0.95
1.87 mm                   0.93    0.94    0.92     0.95     0.98

TABLE IV
Average BMA after combination ($H_1$ true, city task).

City rank difference    $m(H_0)$    $m(H_1)$    $m(H_0 \cup H_1)$
1 - 9                   0.25        0.42        0.33
10 - 18                 0.17        0.53        0.30
19 - 29                 0.10        0.64        0.26
30 - 43                 0.06        0.72        0.21

TABLE V
Average PIC for each probability transformation ($H_1$ true, city task).

City rank difference    BetP    PrPl    PrNPl    PrHyb    DSmP
1 - 9                   0.49    0.57    0.43     0.64     0.80
10 - 18                 0.53    0.61    0.48     0.67     0.82
19 - 29                 0.60    0.67    0.56     0.73     0.85
30 - 43                 0.68    0.73    0.64     0.78     0.89

E. Probability Information Content

The probability information content (PIC) was proposed
by Sudano in [14] as a method for comparing the performance
of various probability transformations. For a subjective
probability assignment $\mathcal{P}(\omega)$ generated by the probability
transformation $\mathcal{P}$, the PIC is defined as

$$\mathrm{PIC}_{\mathcal{P}} = 1 + \frac{1}{\log M} \sum_{\omega \in \Omega} \mathcal{P}(\omega) \log \mathcal{P}(\omega), \qquad (14)$$

where $M = |\Omega|$. A lower PIC value represents a subjective
probability assignment where the alternatives are close to
being equiprobable. A higher PIC value represents a subjective
probability assignment where one of the alternatives is close
to having probability one.
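
As a small illustration of equation (14), the sketch below computes the PIC of a probability assignment given as a list. The function name and the example probabilities are hypothetical; terms with zero probability are skipped, using the usual convention that $p \log p \to 0$ as $p \to 0$.

```python
import math

def pic(probs):
    """Probability information content of a probability assignment, eq. (14)."""
    M = len(probs)
    weighted_logs = sum(p * math.log(p) for p in probs if p > 0)
    return 1.0 + weighted_logs / math.log(M)

print(pic([0.5, 0.5]))   # equiprobable alternatives: PIC approximately 0
print(pic([1.0, 0.0]))   # one alternative certain: PIC equal to 1
print(pic([0.9, 0.1]))   # a committed but uncertain assignment, in between
```

Note that Tables III and V report PIC values averaged over trials, so applying this function to the average BMAs of Tables II and IV would not be expected to reproduce them.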



IV. Simulation Overview 



The 2DSD parameter sets relating to human responses on
the line length discrimination task and the city population size
discrimination task were used to simulate the decision performance
of the five probability transformations of Section III
(i.e., BetP, PrPl, PrNPl, PrHyb, and DSmP). Similar
to what we have done in [17], a pool of 24 human responses
was generated for each task using the parameter sets of the six
subjects given in [15, Tables 3 and 6] by simulating four pairs
of decisions and confidence assessments from each subject.
Subjects were simulated over the first four difficulty levels
given in Table I. For the line length discrimination task, we
let $H_1$ denote the hypothesis that the second line presented
to the subject is longer than the first and let $H_0$ denote the
hypothesis that the first line presented to the subject is longer
than the second. For the city population size discrimination
task, we let $H_1$ denote the hypothesis that the second city
presented to the subject has a higher population than the first
and let $H_0$ denote the hypothesis that the first city presented
has a higher population than the second. For each task, 10,000
trials were conducted for $H_1$ being true and 10,000 trials for
$H_0$ being true. During each trial, the human responses from
the subject pools were used to generate BMAs such that



$$m_i(X) = \begin{cases} p_i & X = a_i, \\ 1 - p_i & X = \Omega, \\ 0 & \text{otherwise.} \end{cases} \qquad (15)$$

Here the subject decisions are given as $a_i \in \Omega = \{H_0, H_1\}$
and confidence assessments as $p_i \in [0, 1]$ for each subject
$i = 1, \ldots, 24$. The subject BMAs $m_i$ were combined two
at a time using Yager’s rule, as described by equations (7)
and (8). Since Yager’s rule is not associative, the combination
order was randomized by choosing from the subject pool uniformly.
The final combined BMAs were then each transformed
into subjective probability assignments $\mathcal{P}_{1,\ldots,24}(\cdot)$ for each
trial using the five probability transformations, BetP, PrPl,
PrNPl, PrHyb, and DSmP.
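
The construction of equation (15) and the randomized pairwise combination can be sketched as below. The function names, the use of a shuffle to randomize the order, and the example responses are assumptions of this illustration and are not taken from the simulation code used in this study.

```python
import random
from functools import reduce
from itertools import product

H0, H1 = frozenset({"H0"}), frozenset({"H1"})
OMEGA = H0 | H1

def response_to_bma(decision, confidence):
    """Equation (15): mass p_i on the declared hypothesis, 1 - p_i on OMEGA."""
    return {decision: confidence, OMEGA: 1.0 - confidence}

def yager_combine(m1, m2):
    """Yager's rule, equations (7) and (8), on the frame OMEGA."""
    combined, conflict = {}, 0.0
    for (Z1, w1), (Z2, w2) in product(m1.items(), m2.items()):
        X = Z1 & Z2
        if X:
            combined[X] = combined.get(X, 0.0) + w1 * w2
        else:
            conflict += w1 * w2
    combined[OMEGA] = combined.get(OMEGA, 0.0) + conflict
    return combined

def fuse_responses(responses, rng=random):
    """Combine (decision, confidence) pairs two at a time in random order."""
    bmas = [response_to_bma(d, p) for d, p in responses]
    rng.shuffle(bmas)   # assumption: a uniform shuffle sets the combination order
    return reduce(yager_combine, bmas)

# Hypothetical responses from four simulated subjects.
responses = [(H1, 0.8), (H0, 0.6), (H1, 1.0), (H1, 0.5)]
fused = fuse_responses(responses, random.Random(0))
print(fused)
```

A subject reporting a confidence of 0.50 contributes equal mass to the declared hypothesis and to $\Omega$, while a confidence of 1.00 places all mass on the declared hypothesis.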

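Using $\mathcal{P}_{1,\ldots,24}(H_1)$, the fused decision $a_0 \in \{H_0, H_1\}$ was
determined such that

$$a_0 = \begin{cases} H_1 & \mathcal{P}_{1,\ldots,24}(H_1) \geq \lambda, \\ H_0 & \text{otherwise,} \end{cases} \qquad (16)$$

where the threshold $\lambda$ is varied in $[0, 1]$ for a desired detection
and false alarm rate pair. The fused false alarm and detection
rates are given as

$$\mathrm{FAR} = P\left(\mathcal{P}_{1,\ldots,24}(H_1) \geq \lambda \mid H_0\right), \qquad (17)$$

and

$$\mathrm{DET} = P\left(\mathcal{P}_{1,\ldots,24}(H_1) \geq \lambda \mid H_1\right). \qquad (18)$$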

The threshold test of equation (16) was used to estimate false
alarm rates and detection rates for varying threshold values
$\lambda$ using the 10,000 simulated responses with $H_1$ true and
the 10,000 responses with $H_0$ true (after applying Yager’s
rule and each of the five probability transforms). These sets
of false alarm and detection rates were then used to create
ROC curves, and measure the areas under the ROC curves
(AUCs) for each probability transform. Higher AUC values
are indicative of higher discriminating performance between
alternatives (i.e., higher detection rates for the same false alarm
error rates). For comparison, the combined BMAs and the PIC
values for each probability transformation were determined
(averaged over 10,000 trials).
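
The threshold sweep of equations (16) to (18) can be sketched as follows. The function names, the threshold grid, and the toy scores are hypothetical; the area is computed with the trapezoidal rule, and the normalization applied to the AUCs reported in this paper is not reproduced.

```python
def roc_points(scores_h1, scores_h0, thresholds):
    """Empirical (FAR, DET) pairs for each threshold, equations (17)-(18)."""
    points = [(0.0, 0.0), (1.0, 1.0)]            # end points of the curve
    for lam in thresholds:
        far = sum(s >= lam for s in scores_h0) / len(scores_h0)
        det = sum(s >= lam for s in scores_h1) / len(scores_h1)
        points.append((far, det))
    return sorted(points)

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area

# Toy fused probabilities of H1 (hypothetical values, not simulation output).
scores_h1 = [0.9, 0.8, 0.6, 0.4]   # trials in which H1 was true
scores_h0 = [0.1, 0.2, 0.5, 0.3]   # trials in which H0 was true
thresholds = [i / 100 for i in range(101)]

print(auc(roc_points(scores_h1, scores_h0, thresholds)))
```

In the setting described above, `scores_h1` and `scores_h0` would each hold the 10,000 transformed probabilities of $H_1$ for one probability transform and one difficulty level.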

V. Results 

Tables II and IV show the average BMA for the line length
discrimination task and the city population size discrimination
task when $H_1$ is true (i.e., the second line is longer than
the first, or the second city has a larger population than the
first). The average BMAs for $H_0$ being true were the same,
except that the values of $m(H_0)$ and $m(H_1)$ were reversed.






Fig. 1. Normalized area under the ROC curve (AUC) versus the number of sources present in combination, for each difficulty level of the line length
discrimination task: (a) line length difference 0.27 mm, (b) 0.59 mm, (c) 1.23 mm, (d) 1.87 mm. Different lines represent the five different probability
transforms investigated in this work. Error bars show the 95% confidence intervals. In each of the four difficulty levels, the curves for all five
probability transforms nearly overlap.



As expected, Yager’s rule yielded combined BMAs which had
high levels of imprecision (i.e., $m(H_0 \cup H_1)$ between 0.05 and
0.31 for the line length discrimination task and between 0.21
and 0.33 for the city population size discrimination task). As
the task becomes easier (i.e., increasing the length difference 
between line pairs or increasing the population rank difference 
between city pairs), more belief mass is placed on the singleton 
elements. Tables III and V show the average PIC values after 
applying each of the five probability transformations to the 
combined BMAs resulting from the line length discrimination 
task and the city population size discrimination task. Higher 
PIC values are usually considered better [13], [14], [24]. The 
results in Tables III and V show increasing PIC values as 
task difficulty decreases, which seems reasonable. For each 
difficulty level, the PIC values follow the same trend with 
PrNPl having the lowest PIC and DSmP having the highest 
PIC. These trends support the notion that DSmP produces
subjective probabilities which are the most committed towards
one of the alternatives (i.e., having the lowest entropy) [13].

Figure 1 shows the normalized AUCs versus the number of 
human responses present in combination for all five probability 
transforms on the line length discrimination task. Figure 3 
shows the same quantities for the city population size discrimination
task. The subplots of Figures 1 and 3 show normalized
AUCs for the four task difficulty levels simulated here. In all
cases, the error bars represent the 95% confidence intervals.
As expected, normalized AUC values increase as the task
becomes easier. For any given difficulty level, however, the
differences in normalized AUC between the five probability
transformations were not statistically significant. The overall
discriminating performance of the five probability transforms is
therefore effectively the same.

The ROC curves after combination for all five probability
transforms are shown in Figure 2 for the line length
discrimination task and in Figure 4 for the city population
size task. The subplots of Figures 2 and 4 show ROCs
for the four task difficulty levels. The ranges of the graph
axes correspond to “reasonable” false alarm rates (i.e., up to
0.30). Again, the overall shape of the ROC for all probability
transforms improves as the tasks become easier (supporting
the results shown in Figure 3). For lower false alarm rates
(e.g., less than 0.07), BetP, PrPl, PrNPl, and PrHyb
produce similar detection rates, which are all higher than
those produced by DSmP. For higher false alarm rates (e.g.,
greater than 0.10), DSmP produces higher detection rates
than the remaining four probability transforms. As false alarm
rates become even higher (e.g., greater than 0.25), similar
performance is observed for all probability transforms. For the
hardest and easiest variations of both tasks, these performance
differences become less apparent. These observations support the
conclusions reached by [25]: depending on the acceptable error
rate for a specific task, a higher PIC value may not necessarily
indicate higher detection rates.

Fig. 2. ROC curves for each difficulty level of the line length discrimination task, showing false alarm rates up to 0.30: (a) line length
difference 0.27 mm, (b) 0.59 mm, (c) 1.23 mm, (d) 1.87 mm. Different lines represent the five different probability transforms investigated.

VI. Conclusions 

Using models of human responses from cognitive psychology,
we have shown that there exist cases in which a higher probability
information content (PIC) does not indicate a “better”
probability transform. In the example presented here, two-stage
dynamic signal detection (2DSD) was used to show
that increasing PIC values may not necessarily lead to better
discriminability performance of a probability transform.
Specifically, it was found that all five probability transforms
(i.e., BetP, PrPl, PrNPl, PrHyb, and DSmP) yielded
nearly the same discriminability performance (i.e., normalized AUC
values) regardless of the number of sources included in
the combination. Furthermore, the results indicate that some
probability transforms yield higher detection rates than others,
depending on the false alarm rate required. For lower false
alarm rates (e.g., less than 0.07), the BetP, PrPl, PrNPl,
and PrHyb transforms yielded higher detection rates than
DSmP. For higher false alarm rates (e.g., greater than 0.10),
this trend was reversed. These findings support the arguments
presented in [25], and suggest that simulation and testing
should be performed before the components of a specific
fusion system are selected.

Fig. 3. Normalized area under the ROC curve (AUC) versus the number of sources present in combination, for each difficulty level of the city population
size discrimination task: (a) city rank difference 1 - 9, (b) 10 - 18, (c) 19 - 29, (d) 30 - 43. Different lines represent the five different probability
transforms investigated in this work. Error bars show the 95% confidence intervals. In each of the four difficulty levels, the curves for all five
probability transforms nearly overlap.